An Algorithm Based on Horizontal Bit Vectors for Mining Frequent Patterns in Data Streams
Abstract
Most algorithms for mining frequent patterns in data streams are built on structures such as the FP-tree, whose complex mining procedures cost more time and storage space than a bit vector representation. This paper proposes HB-FPS, an algorithm based on Horizontal Bit vectors for mining Frequent Patterns in data Streams. HB-FPS works in two phases. In the online phase, it uses bit vectors to express all transactions horizontally according to whether an item occurs in them: a bit value of 1 means the item occurs, and 0 means it does not. In the offline phase, HB-FPS starts from the biggest item, first mines all frequent 2-itemsets that contain that item, and then generates candidate k-itemsets from the frequent (k-1)-itemsets, growing all frequent patterns one item group at a time. Experiments show that HB-FPS has high efficiency and good scalability. Theoretical analysis also indicates that it has low space overhead.
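The abstract does not spell out the data layout or the candidate join, so the following is only a minimal sketch of the two phases under stated assumptions: each item gets one Python integer used as a bit vector over the transactions, support is the population count of the AND of the item vectors, and candidates are joined Apriori-style on a shared prefix. The function names (`build_bit_vectors`, `support`, `mine_frequent`) are hypothetical, items are assumed to be sortable (integers here), and stream window maintenance is left out entirely.

```python
from itertools import combinations


def build_bit_vectors(transactions):
    """Online-phase sketch: bit i of vectors[item] is 1 if transaction i contains item."""
    vectors = {}
    for i, transaction in enumerate(transactions):
        for item in transaction:
            vectors[item] = vectors.get(item, 0) | (1 << i)
    return vectors


def support(vector):
    """Number of transactions whose bit is set."""
    return bin(vector).count("1")


def mine_frequent(vectors, min_support):
    """Offline-phase sketch: mine one item group at a time, biggest item first.

    Returns a dict mapping each frequent itemset (a sorted tuple) to its support.
    """
    items = sorted(i for i, v in vectors.items() if support(v) >= min_support)
    frequent = {(i,): support(vectors[i]) for i in items}

    for pos in range(len(items) - 1, 0, -1):
        top = items[pos]
        # Frequent 2-itemsets that pair `top` with a smaller item.
        # Keys hold only the smaller items; `top` is implicit in every key.
        level = {}
        for other in items[:pos]:
            v = vectors[other] & vectors[top]
            if support(v) >= min_support:
                level[(other,)] = v

        # Grow k-itemsets from (k-1)-itemsets inside this item group.
        while level:
            for prefix, v in level.items():
                frequent[prefix + (top,)] = support(v)
            next_level = {}
            for a, b in combinations(sorted(level), 2):
                if a[:-1] == b[:-1]:  # join itemsets that differ only in the last item
                    v = level[a] & level[b]
                    if support(v) >= min_support:
                        next_level[a + (b[-1],)] = v
            level = next_level
    return frequent
```

Calling `mine_frequent(build_bit_vectors(transactions), min_support)` on a list of item sets returns every itemset whose support reaches the threshold. Because support is computed exactly from the ANDed vectors, the sketch skips the usual subset-pruning step; a real implementation would likely add it to avoid needless AND operations.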
Similar resources
Mining Frequent Patterns in Uncertain and Relational Data Streams using the Landmark Windows
Today, in many modern applications, we search for frequent and repeating patterns in the analyzed data sets. In this search, we look for patterns that appear frequently in a data set and mark them as frequent patterns so that users can make decisions based on these discoveries. Most algorithms presented in the context of data stream mining and frequent pattern detection, work either on uncertai...
An Efficient Approach for Mining Fault-Tolerant Frequent Patterns Based on Bit Vector Representations
In this paper, an algorithm called VB-FT-Mine (Vectors-Based Fault-Tolerant frequent patterns Mining) is proposed for mining fault-tolerant frequent patterns efficiently. In this approach, fault-tolerant appearing vectors are designed to represent the distribution of the candidate patterns contained in data sets with fault tolerance. VB-FT-Mine algorithm applies depth-first pattern growing ...
Frequent Pattern Mining from Dense Graph Streams
As technology advances, streams of data can be produced in many applications such as social networks, sensor networks, bioinformatics, and chemical informatics. These kinds of streaming data share a property in common—namely, they can be modeled in terms of graph-structured data. Here, the data streams generated by graph data sources in these applications are graph streams. To extract implicit,...
An Efficient Mining Algorithm by Bit Vector Table for Frequent Closed Itemsets
Mining frequent closed itemsets in data streams is an important task in stream data mining. In this paper, an efficient mining algorithm (denoted as EMAFCI) for frequent closed itemsets in data streams is proposed. The algorithm is based on the sliding window model, and uses a Bit Vector Table (denoted as BVTable) where the transactions and itemsets are recorded by the column and row vectors res...
A Single-scan Algorithm for Mining Sequential Patterns from Data Streams
Sequential pattern mining (SPAM) is one of the most interesting research issues of data mining. In this paper, a new research problem of mining data streams for sequential patterns is defined. A data stream is an unbounded sequence of data elements arriving at a rapid rate. Based on the characteristics of data streams, the problem complexity of mining data streams for sequential patterns is more ...